2025 Practical Statistics for Medical Research

Interactive R Companion for SPSS Users

Author

Jan Hughes-Austin and D. Eastern Kang Sim

Welcome to Your R Companion

Important Note: This is a Learning Companion, Not a Replacement

Companion, Not Substitute

This interactive guide serves as a learning companion to your SPSS-based statistics course, not a replacement. While your primary instruction uses SPSS, this resource helps you explore how the same statistical concepts and analyses can be implemented in R.

Why Learn R for Statistics?

R is a free and open-source programming language designed for statistical computing and data analysis. Compared with proprietary software, R offers key advantages for scientific research:

Reproducibility: R scripts document every step of your analysis, so anyone can see exactly what you did and rerun the same code to replicate your results.

Flexibility: With thousands of packages (libraries) available, R can handle virtually any statistical method or data visualization need.

Introduction to the Tidyverse

The tidyverse is “an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.” Because the packages share these conventions, data analysis code tends to be more intuitive and efficient to write.

Core Philosophy: Tidy datasets are easier to manipulate, model, and visualize because the tidy data principles impose a general framework and a consistent set of rules on data.

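The Pipe Operator (%>%): One of the most powerful features of the tidyverse is the pipe operator. It comes from the magrittr package and is available whenever the tidyverse is loaded. The pipe passes the result on its left as the first argument of the function on its right, which lets you chain operations together in a readable way: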

# Instead of nested functions (hard to read)
result <- function3(function2(function1(data, arg1), arg2), arg3)

# Use pipes (reads left to right, top to bottom)
result <- data %>%
  function1(arg1) %>%
  function2(arg2) %>%
  function3(arg3)

This approach makes your code more readable and mirrors how you think about data analysis: “take the data, then do this, then do that.”
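
As a concrete illustration of the pattern above, here is a minimal sketch using the built-in mtcars dataset and a few dplyr verbs. The dataset and the choice of columns are assumptions made for illustration; they are not part of the course materials.

```r
# dplyr provides the %>% pipe and the verbs used below
library(dplyr)

# Take the data, then keep manual-transmission cars (am == 1),
# then group by number of cylinders, then summarise fuel economy
mpg_summary <- mtcars %>%
  filter(am == 1) %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

print(mpg_summary)
```

In R 4.1 and later, the native pipe |> can be used the same way for simple chains like this one.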

Getting Started

Interactive Learning

All code blocks in this companion are interactive! You can modify and run them directly in your browser. This hands-on approach helps you learn by doing, which is essential for mastering both statistical concepts and R programming.


Session 1: Concepts of Measurement

Understanding Variables and Measurement Scales

In statistics, understanding the type of data you’re working with is crucial for choosing appropriate analytical methods. Let’s explore the different types of variables using R and visualizations.

Types of Variables
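
One way to see how measurement scales map onto R data types is sketched below. The patient records are made-up values, and the column names are hypothetical; the example assumes the common nominal, ordinal, interval, and ratio classification.

```r
# Hypothetical patient records (made-up values for illustration)
patients <- data.frame(
  # Nominal: categories with no inherent order
  blood_type = factor(c("A", "O", "B", "O", "AB")),
  # Ordinal: categories with a meaningful order
  pain_level = factor(c("mild", "severe", "moderate", "mild", "moderate"),
                      levels = c("mild", "moderate", "severe"),
                      ordered = TRUE),
  # Interval: differences are meaningful, but zero is arbitrary
  temp_c = c(36.8, 38.2, 37.1, 36.5, 37.9),
  # Ratio: true zero, so ratios are meaningful
  weight_kg = c(70.5, 82.0, 65.3, 90.1, 77.4)
)

# Inspect how R stores each column
str(patients)

# Ordered factors support comparisons; unordered factors do not
patients$pain_level > "mild"

# Frequency tables suit categorical variables; means suit numeric ones
table(patients$blood_type)
mean(patients$weight_kg)
```

Storing a categorical variable as a factor rather than as plain text or numeric codes helps R choose sensible defaults later, for example in tables and plots.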

Visualizing Different Variable Types

Let’s create visualizations that are appropriate for each type of variable:

Variable Classification Exercise

Measurement Error and Reliability

Understanding measurement error is crucial for interpreting statistical results. Let’s simulate measurement error to demonstrate its effects:
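
A minimal simulation sketch follows, assuming the classical model in which each observed value equals a true value plus random error with mean zero. The blood pressure values, error sizes, and variable names are hypothetical.

```r
set.seed(123)  # arbitrary seed so the simulation is repeatable

n <- 500
# Hypothetical true systolic blood pressure values (mmHg)
true_sbp <- rnorm(n, mean = 120, sd = 15)

# Observed value = true value + random error;
# a larger error sd represents a noisier instrument
observed_precise <- true_sbp + rnorm(n, mean = 0, sd = 3)
observed_noisy   <- true_sbp + rnorm(n, mean = 0, sd = 15)

# The squared correlation with the true score estimates reliability
estimated <- c(precise = cor(true_sbp, observed_precise)^2,
               noisy   = cor(true_sbp, observed_noisy)^2)
estimated

# Theoretical reliability: true variance / (true variance + error variance)
theoretical <- c(precise = 15^2 / (15^2 + 3^2),
                 noisy   = 15^2 / (15^2 + 15^2))
theoretical
```

In real data the true values are unknown, so reliability has to be estimated indirectly, for example from repeated measurements. The simulation only works because it generates the true values itself.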

Reliability and Validity Demonstration


Session 3: Descriptive Statistics

Overview of Descriptive Statistics

Descriptive statistics summarize and describe the basic features of a dataset so that we can make concise quantitative statements about the data. Let’s explore these concepts using R and the tidyverse.

Creating Sample Data for Analysis

Measures of Central Tendency

The Mean, Median, and Mode
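
The sketch below computes all three measures on a small made-up vector of hospital stay lengths. Base R's mode() reports how an object is stored rather than its most frequent value, so the example defines a small helper; stat_mode is a hypothetical name, not a built-in function.

```r
# Hypothetical lengths of hospital stay in days (made-up, right-skewed)
stay_days <- c(2, 3, 3, 4, 5, 3, 6, 21)

mean(stay_days)    # arithmetic mean, pulled upward by the 21-day stay
median(stay_days)  # middle value, resistant to the extreme stay

# Hypothetical helper: return the most frequent value(s) of a numeric vector
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

stat_mode(stay_days)
```

Here the mean exceeds the median because a single long stay pulls it upward, which is the usual pattern in right-skewed data.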

Understanding Skewness Through Visualization

Measures of Spread (Variability)

Range, Variance, and Standard Deviation
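
These measures are all available in base R. The sketch below uses made-up pain scores and also recomputes the variance step by step to show where the n - 1 divisor enters.

```r
# Hypothetical pain scores on a 0-10 scale (made-up values)
pain <- c(4, 8, 6, 5, 3, 7, 9, 6)

range(pain)        # minimum and maximum
diff(range(pain))  # range as a single number: maximum minus minimum
var(pain)          # sample variance (divides by n - 1)
sd(pain)           # sample standard deviation (square root of the variance)

# The same variance computed step by step
squared_deviations <- (pain - mean(pain))^2
manual_var <- sum(squared_deviations) / (length(pain) - 1)
manual_var
```

Note that var() and sd() always use the sample (n - 1) formulas.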

Coefficient of Variation Comparison
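
The coefficient of variation expresses the standard deviation as a percentage of the mean, which allows variability to be compared across variables measured in different units. Base R has no dedicated function for it, so this sketch defines a hypothetical cv() helper and applies it to made-up values.

```r
# Hypothetical measurements on very different scales (made-up values)
weight_kg <- c(62, 70, 75, 81, 90)
crp_mg_l  <- c(1.2, 2.5, 3.1, 0.8, 6.4)

# Hypothetical helper: standard deviation as a percentage of the mean
cv <- function(x) 100 * sd(x) / mean(x)

cv_values <- c(weight = cv(weight_kg), crp = cv(crp_mg_l))
cv_values
```

The coefficient of variation is only meaningful for ratio-scale variables with a positive mean. For a variable such as temperature in degrees Celsius, the result would change with the choice of unit.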

Descriptive Statistics by Groups
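
Grouped summaries follow the "take the data, then do this" pattern of the pipe. The sketch below uses the built-in ToothGrowth dataset as a stand-in; the dataset choice is an assumption for illustration.

```r
library(dplyr)

# ToothGrowth ships with R: tooth length (len) by supplement type (supp) and dose
group_summary <- ToothGrowth %>%
  group_by(supp) %>%
  summarise(
    n      = n(),          # group size
    mean   = mean(len),    # group mean
    sd     = sd(len),      # group standard deviation
    median = median(len)   # group median
  )

print(group_summary)
```

Adding a second variable to group_by(), for example group_by(supp, dose), would produce one row per combination of groups.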

Data Visualization for Descriptive Statistics

Distribution Plots

Box Plots for Group Comparisons

Correlation Visualization

Sampling Distribution Demonstration
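
A simulation sketch of a sampling distribution is shown below. It assumes a skewed, exponentially distributed population with mean 10; the population, sample size, and number of repetitions are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed for repeatability

# Hypothetical skewed population: exponential with mean 10
population <- rexp(100000, rate = 1 / 10)

# Draw 2000 samples of size 30 and record each sample mean
sample_means <- replicate(2000, mean(sample(population, size = 30)))

mean(population)           # population mean
mean(sample_means)         # centre of the sampling distribution
sd(sample_means)           # empirical standard error
sd(population) / sqrt(30)  # theoretical standard error

# The sample means are far more symmetric than the population itself
hist(sample_means, breaks = 40,
     main = "Sampling distribution of the mean (n = 30)",
     xlab = "Sample mean")
```

Rerunning the code with a larger size argument should give a narrower histogram, since the standard error shrinks as the sample size grows.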


Session 4: Statistical Inference and Hypothesis Testing

Introduction to Statistical Inference

Statistical inference allows us to estimate population characteristics from sample data and test theories about the effects of treatments. Let’s explore these concepts using R.

Probability and the Normal Distribution

Sampling Error and Standard Error
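
The standard error of the mean is the sample standard deviation divided by the square root of the sample size. A short sketch with made-up glucose values:

```r
# Hypothetical fasting glucose values in mg/dL (made-up)
glucose <- c(92, 88, 101, 97, 85, 110, 95, 99, 90, 93)

n  <- length(glucose)
se <- sd(glucose) / sqrt(n)  # standard error of the mean
se

# With the same standard deviation, larger hypothetical sample sizes
# give smaller standard errors
se_by_n <- sd(glucose) / sqrt(c(10, 40, 160))
se_by_n
```

Each fourfold increase in sample size halves the standard error, because the sample size enters through its square root.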

Confidence Intervals
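
A 95% confidence interval for a mean can be built by hand from the sample mean, the standard error, and a critical value from the t distribution. The sketch below uses made-up glucose values and compares the manual result with the interval that t.test() reports.

```r
# Hypothetical fasting glucose values in mg/dL (made-up)
glucose <- c(92, 88, 101, 97, 85, 110, 95, 99, 90, 93)

n      <- length(glucose)
m      <- mean(glucose)
se     <- sd(glucose) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)  # two-sided 95% critical value

# Mean plus or minus critical value times standard error
ci_manual <- c(lower = m - t_crit * se, upper = m + t_crit * se)
ci_manual

# t.test() reports the same interval
ci_builtin <- t.test(glucose)$conf.int
ci_builtin
```

For a different confidence level, change 0.975 in qt() accordingly (for example 0.995 for 99%) or pass conf.level to t.test().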

Hypothesis Testing Framework

One-Sample t-Test
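
In R, a one-sample t-test is run with t.test() and the mu argument. The data and the reference value of 100 mg/dL below are hypothetical.

```r
# Hypothetical fasting glucose values in mg/dL (made-up)
glucose <- c(92, 88, 101, 97, 85, 110, 95, 99, 90, 93)

# H0: the population mean equals 100 mg/dL (hypothetical reference value)
one_sample <- t.test(glucose, mu = 100)
one_sample

# Individual pieces of the result can be extracted by name
one_sample$statistic  # t statistic
one_sample$parameter  # degrees of freedom
one_sample$p.value    # two-sided p-value

# The t statistic by hand: (sample mean - mu) / standard error
t_manual <- (mean(glucose) - 100) / (sd(glucose) / sqrt(length(glucose)))
t_manual
```

The printed output also includes the 95% confidence interval for the mean.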

Independent Samples t-Test
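
The sketch below compares two independent groups using made-up blood pressure values. By default, t.test() runs Welch's test, which does not assume equal variances; setting var.equal = TRUE gives the classic Student's test. These correspond to the "equal variances not assumed" and "equal variances assumed" rows that SPSS prints side by side.

```r
# Hypothetical systolic blood pressure (mmHg) in two independent groups (made-up)
treatment <- c(118, 125, 130, 121, 127, 119, 124, 122)
control   <- c(132, 128, 140, 135, 129, 138, 131, 136)

# Welch's t-test (R's default): equal variances not assumed
welch <- t.test(treatment, control)
welch

# Student's t-test: equal variances assumed
student <- t.test(treatment, control, var.equal = TRUE)
student
```

When the data are in long format with a grouping column, the formula interface t.test(outcome ~ group, data = ...) runs the same test.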

Paired Samples t-Test
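
A paired test applies when each observation in one condition is matched to an observation in the other, such as before and after measurements on the same patients. The values below are made up. The sketch also shows that the paired test is equivalent to a one-sample test on the differences.

```r
# Hypothetical systolic blood pressure (mmHg) for the same 8 patients (made-up)
before <- c(142, 138, 150, 145, 139, 148, 141, 146)
after  <- c(135, 136, 141, 140, 137, 139, 138, 140)

# Paired t-test: the vectors must be in the same patient order
paired <- t.test(before, after, paired = TRUE)
paired

# Equivalent: one-sample t-test on the within-patient differences
on_differences <- t.test(before - after, mu = 0)
on_differences
```

Because the test works on within-patient differences, the degrees of freedom are the number of pairs minus one.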

Power Analysis
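
Base R's power.t.test() solves for whichever of sample size, effect size, or power is left unspecified. The sketch below assumes a hypothetical study aiming to detect a 5 mmHg difference with a standard deviation of 10 mmHg; these planning values are illustrative only.

```r
# Sample size per group needed to detect a 5 mmHg difference,
# assuming sd = 10 mmHg, two-sided alpha = 0.05, and 80% power
needed <- power.t.test(delta = 5, sd = 10, sig.level = 0.05, power = 0.80,
                       type = "two.sample", alternative = "two.sided")
needed

# Round up to whole participants per group
n_per_group <- ceiling(needed$n)
n_per_group

# Power achieved if only 30 participants per group are available
achieved <- power.t.test(n = 30, delta = 5, sd = 10, sig.level = 0.05)
achieved$power
```

The n reported by power.t.test() is per group for a two-sample design, so the total sample size is twice that number.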

Summary of Tests


This companion continues to evolve. For updates and additional resources, check the course website.